Add Partial Dependence Plot #318
base: main
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##            main     #318    +/-  ##
==========================================
+ Coverage  98.10%   98.19%   +0.08%
==========================================
  Files         22       22
  Lines       1161     1163      +2
==========================================
+ Hits        1139     1142      +3
+ Misses        22       21      -1
No worries, you can copy sklearn's code.
Can you base this PR on top of the other one, so that the diffs are readable? Otherwise, it's not possible to work on it.
Done.
I just had a look at the example so far.
Co-authored-by: bthirion <[email protected]>
A first pass on the pdp module.
Notes
-----
Based on scikit-learn's _grid_from_X implementation:
https://github.com/scikit-learn/scikit-learn/blob/c5497b7f7eacfaff061cf68e09bcd48aa93d4d6b/sklearn/inspection/_partial_dependence.py#L40
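For context, the idea behind a grid-building helper like _grid_from_X can be sketched as follows. This is a minimal illustration with a hypothetical function name, not scikit-learn's actual implementation: it builds an evaluation grid between two percentiles of a feature, falling back to the unique values when there are few of them.

```python
import numpy as np

def grid_from_percentiles(X_col, grid_resolution=100, percentiles=(0.05, 0.95)):
    """Hypothetical sketch of a percentile-based evaluation grid.

    Mirrors the idea of sklearn's _grid_from_X: evaluate the model on a
    grid spanning the bulk of the feature's distribution, not its extremes.
    """
    low, high = np.percentile(X_col, [100 * p for p in percentiles])
    uniques = np.unique(X_col)
    if uniques.size < grid_resolution:
        # Few distinct values (e.g. a categorical-like column): use them all.
        return uniques
    return np.linspace(low, high, num=grid_resolution)

rng = np.random.default_rng(0)
x = rng.normal(size=500)
grid = grid_from_percentiles(x, grid_resolution=50)
```

Clipping to percentiles avoids evaluating the model in regions with almost no data, where extrapolation would make the curves unreliable.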
Can't we simply import it ?
No, I need to modify it to get the ICEs.
You can copy BSD-licensed code without constraint. We should, however, acknowledge the origin of the code in the documentation. Why didn't you simply add a light wrapper on top of sklearn's function?
The scikit-learn implementation is not suited to computing variable importance because it doesn't provide access to the ICE curves. Furthermore, it is limited to one feature at a time instead of handling several at once. Moreover, scikit-learn's plot has some drawbacks, as I mentioned in issue #51.
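To illustrate the ICE/PDP relationship being discussed, here is a minimal numpy sketch (not this PR's actual code; `predict` stands in for any fitted model): each ICE curve is the prediction for one sample as a single feature sweeps over a grid, and the PDP is the average of those curves.

```python
import numpy as np

# Toy data and a linear stand-in for a fitted model's predict().
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w = np.array([2.0, -1.0, 0.5, 0.0, 3.0])
predict = lambda X: X @ w

feature = 0
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), num=20)

# ICE: one curve per sample, obtained by forcing the chosen feature to
# each grid value while keeping all other features at their observed values.
ice = np.empty((X.shape[0], grid.size))
for j, v in enumerate(grid):
    X_mod = X.copy()
    X_mod[:, feature] = v
    ice[:, j] = predict(X_mod)

pdp = ice.mean(axis=0)  # the PDP is the average of the ICE curves
```

Keeping `ice` around (rather than only `pdp`, as scikit-learn's averaged output does) is what allows per-sample importance computations downstream.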
It looks good overall but the example is a bit lengthy.
Could we simplify by, for instance, using only the MLP or boosted tree?
This is a marginal method for computing variable importance.
I based this PR on the PR #220 for the API and PR #265 for the testing tools.
The figures can be improved with some suggestions.
The main limitation of this implementation is that it considers the effect of only one feature at a time. Scikit-learn supports up to three features, but the importance score can be tricky to compute for more than one feature, so I decided not to increase the number of features for now and avoid problems with the importance computation.
This method can be a good example for improving the other methods, because it is based on scikit-learn's implementation, which handles many more cases than the basic ones.
I literally copied scikit-learn's code and their example.
What is the best way to manage license issues? @bthirion
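For reference, the Greenwell et al. (2018) importance measure cited below reduces to a one-liner on top of the PDP: the importance of a feature is the spread (standard deviation) of its partial dependence curve, so a flat curve means no importance. A minimal numpy sketch, with `predict` standing in for any fitted model (not this PR's actual code):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
w = np.array([2.0, 0.0, -0.5])
predict = lambda X: X @ w  # stand-in for a fitted model's predict()

def pdp_importance(X, feature, grid_resolution=30):
    """Std dev of the PDP curve, per Greenwell et al. (2018)."""
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(),
                       num=grid_resolution)
    pdp = np.empty(grid.size)
    for j, v in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = v       # force the feature to the grid value
        pdp[j] = predict(X_mod).mean()
    return pdp.std()

importances = [pdp_importance(X, f) for f in range(X.shape[1])]
```

With this toy linear model, the feature with zero weight gets a flat PDP and hence zero importance, while larger weights yield larger importances.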
References:
Original method: Friedman, Jerome H. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” The Annals of Statistics 29 (5): 1189–1232. https://doi.org/10.1214/aos/1013203451.
Extension to variable importance: Greenwell, Brandon M., Bradley C. Boehmke, and Andrew J. McCarthy. 2018. “A Simple and Effective Model-Based Variable Importance Measure.” arXiv. https://doi.org/10.48550/arXiv.1805.04755.
Implementation: