
Conversation

lionelkusch
Collaborator

Update the model of CFI, PFI and LOCO for API 2.

@lionelkusch lionelkusch added the API 2 Refactoring following the second version of API label Sep 2, 2025

codecov bot commented Sep 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.19%. Comparing base (d159dca) to head (3c52789).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
+ Coverage   98.10%   98.19%   +0.09%     
==========================================
  Files          22       22              
  Lines        1159     1222      +63     
==========================================
+ Hits         1137     1200      +63     
  Misses         22       22              


Collaborator

@jpaillard jpaillard left a comment


It looks good but the diff seems very large for this small change.
Is there a reason for all the other modifications?

@lionelkusch
Collaborator Author

I reorganized the parameters in the init a bit and moved the docstring to the class, because I plan to do this in all the other classes.

Looking into it in more detail, I missed adding some parts. I will add them and ask you to review afterwards. Sorry about that.

Collaborator

@bthirion bthirion left a comment


This PR is definitely an improvement, thx.

).pvalue
return self.importances_

def fit_importance(
Collaborator


I find it disturbing that fit_importance has a behavior that is quite different from simply calling fit, then importance.

  • Could we add a check to the .fit() method to ensure that the estimator is fitted, and fit it if not?
  • Could we allow passing a list of fitted estimators matching the number of splits? That could typically be relevant for users who want to pass DL models, trained beforehand, through skorch for instance.
  • If the models are not fitted, can we store them, in estimators_ for instance? It is useful to check the predictive performance in addition to the importance.

Collaborator Author


  • Could we add a check to the .fit() method to ensure that the estimator is fitted, and fit it if not?

From my point of view, I don't think so, because the goal is to compute the importance within the cross-validation, as in the example plot_model_agnostic__importance.

  • Could we allow passing a list of fitted estimators matching the number of splits? That could typically be relevant for users who want to pass DL models, trained beforehand, through skorch for instance.

This also requires having the indices of the cross-validation. At this point, it's better for the user to write the loop themselves.

  • If the models are not fitted, can we store them, in estimators_ for instance? It is useful to check the predictive performance in addition to the importance.

Yes, I will add this.
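For reference, the loop the user would write themselves could look like this (a sketch using only scikit-learn; compute_importance is a placeholder for the per-fold CFI/LOCO/PFI computation, not an actual hidimstat function):

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold


def cross_val_importance(estimator, X, y, compute_importance, n_splits=5):
    """Fit one clone per fold and collect the per-fold importances.

    The fitted estimators are returned as well, so predictive
    performance can be checked in addition to the importance.
    """
    estimators_, importances_ = [], []
    for train, test in KFold(n_splits=n_splits).split(X):
        est = clone(estimator).fit(X[train], y[train])
        estimators_.append(est)
        importances_.append(compute_importance(est, X[test], y[test]))
    return estimators_, np.asarray(importances_)
```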

Collaborator Author


I made the modification; tell me if it's OK.

Collaborator


IMO, there are too many patches that are not optimal and would be avoided by creating a dedicated BasePerturbationCV:

  • The computation of p-values by taking the mean over folds is not valid
  • A big benefit of model-agnostic approaches (LOCO, CFI...) is to support DL models. However, it is not reasonable to force the training of DL models in the 'fit' of Hidimstat's methods. We should allow passing a list of fitted estimators to support this use case.

Collaborator Author

@lionelkusch lionelkusch Sep 12, 2025


What is DL?

I agree that this is not an optimal approach. However, it would require a redesign of the CV usage and of the estimator management. @bthirion doesn't want CV to be a parameter of fit_importances, and for the moment the estimator needs to be fitted before usage.

I don't see the point of having another class BasePerturbationCV if it only modifies fit_importances.
Passing a list of fitted estimators together with a CV could be an idea, but it is difficult to assert the link between these two objects.

  • The computation of p-values by taking the mean over folds is not valid

Do you have a better solution?
I thought of using the function aggregate_pvalue, but I don't know whether it's correct in this case.
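For what it's worth, SciPy offers combine_pvalues for aggregating per-fold p-values, e.g. with Fisher's method. Note it assumes independent p-values, which CV folds only approximate; this is a toy sketch, not a claim about what aggregate_pvalue does:

```python
import numpy as np
from scipy.stats import combine_pvalues

# p-values obtained on each of k folds (toy numbers)
fold_pvalues = np.array([0.04, 0.10, 0.02, 0.20, 0.07])

# Fisher's method: -2 * sum(log p) follows chi2(2k) under the null,
# assuming the per-fold p-values are independent (only approximately
# true for CV folds, which share training data).
result = combine_pvalues(fold_pvalues, method="fisher")
combined_p = result.pvalue
```

Unlike averaging, this is a calibrated combination rule under its independence assumption.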

Collaborator


DL: Deep Learning

Collaborator

@bthirion bthirion left a comment


Thx for the progress. Please find a few suggestions enclosed.

Collaborator

@bthirion bthirion left a comment


We're almost there.

Attributes
----------
features_groups : dict
Mapping of feature groups identified during fit.
Collaborator


This is no longer accurate IIUC.

Collaborator Author


What do you mean?
It's still accurate in this version of the code.

Collaborator

@jpaillard jpaillard left a comment


Thank you.
I agree with the goal of the modifications, but I believe it is not optimal to implement the CV by simply patching the current class; we need a dedicated class BasePerturbationCV.

Comment on lines +270 to +272
self.importances_ = np.mean(self.importances_cv_, axis=0)
self.pvalues_ = (
None if self.pvalues_cv_[0] is None else np.mean(self.pvalues_cv_, axis=0)
Collaborator


That looks problematic:

  • The p-value of the CV estimator is computed over the k test statistics, where k is the number of folds, so self.pvalues_cv_ should be 1d. Even if it were 2d, taking the mean of p-values is not, in general, a p-value.
  • I see the problem that leaving self.pvalues_ as None would leave the instance "not-fitted"; for me, this calls for creating a sub-class BasePerturbationCV.
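The first bullet can be illustrated with a quick simulation: under the null hypothesis a valid p-value is uniform on [0, 1], but the mean over k folds concentrates around 0.5, so it no longer behaves like a p-value (toy simulation, independent folds assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_repeats = 5, 100_000

# Under the null hypothesis, each fold's p-value is uniform on [0, 1].
null_pvalues = rng.uniform(size=(n_repeats, k))
mean_over_folds = null_pvalues.mean(axis=1)

# A valid p-value satisfies P(p <= alpha) = alpha under the null;
# the fold-averaged quantity rejects far less often than alpha = 0.05,
# so thresholding it no longer controls the error rate.
rejection_rate = (mean_over_folds <= 0.05).mean()
```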

Collaborator Author


  • I see the problem that leaving self.pvalues_ as None would leave the instance "not-fitted"; for me, this calls for creating a sub-class BasePerturbationCV.

This is not a problem, because p-values cannot be computed by all methods, such as LOCO.
The check is based only on importances_.


Comment on lines +205 to +207
self.pvalues_ = ttest_1samp(
test_result, 0.0, axis=1, alternative="greater"
).pvalue
Collaborator Author


As issue #48 mentions, should I propose a better way to compute the p-value?

If you want the test function to be passed as a parameter, do you have suggestions for its signature?

Collaborator


I suggest something similar to scikit-learn's metrics: support both strings ('ttest', 'wilcoxon', 'corrected-ttest', ...) and functions, e.g. lambda x: ttest_1samp(x, 0.0, axis=1, alternative="greater")[1]

Collaborator Author


The signature would be more like this:
test(diff_loss) -> pvalue

Do you think that losses - mean_losses as the parameter is enough, or do we need more information?

Collaborator

@jpaillard jpaillard Sep 12, 2025


The signature you describe looks good. However, I think it would be nice to also support passing a string for classical tests. That would save the user from defining a function that follows the described signature while fixing its other parameters (e.g. axis=1, alternative="greater", ...).

Collaborator Author


I don't really like having strings because I find them difficult to manage, but in this case it can be interesting.
I will try to add a function for it.
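Such a helper could look like the sketch below, in the spirit of scikit-learn's scorer lookup. The string names, and the use of wilcoxon with an axis argument (SciPy >= 1.11), are assumptions on my side; the signature follows test(diff_loss) -> pvalue as discussed above:

```python
from scipy.stats import ttest_1samp, wilcoxon

# Named tests mapping a string to a test(diff_loss) -> pvalue callable
# (hypothetical names; diff_loss has shape (n_features, n_samples)).
_NAMED_TESTS = {
    "ttest": lambda x: ttest_1samp(x, 0.0, axis=1, alternative="greater").pvalue,
    "wilcoxon": lambda x: wilcoxon(x, axis=1, alternative="greater").pvalue,
}


def check_test(test):
    """Resolve a string name or a callable into a test(diff_loss) -> pvalue."""
    if callable(test):
        return test
    try:
        return _NAMED_TESTS[test]
    except KeyError:
        raise ValueError(
            f"Unknown test {test!r}; expected one of {sorted(_NAMED_TESTS)} "
            "or a callable."
        )
```

This mirrors how scikit-learn resolves scoring="accuracy" versus a custom scorer function.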

Collaborator


The point is that ttest_1samp is originally a SciPy function; keeping a similar API helps users.
