
Conversation

@fkiraly (Collaborator) commented Feb 22, 2025

This PR adds a systematic test suite and a check_estimator utility for pytorch-forecasting forecasters.

The interface checked is the current unified API across models. The API may change in the future, but this PR makes no changes to it.

Work in progress.

@fkiraly fkiraly added the enhancement New feature or request label Feb 22, 2025
```python
    return all_train_kwargs, train_kwargs_names


def _integration(
```
A contributor commented on this diff:

After looking at the _integration function in this line, I was wondering whether it could be modularized better by splitting it into smaller functions like prepare_data, setup_trainer, etc. This might help in debugging errors flagged by the tests. If required, these smaller functions can be nested under a larger function.
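A rough sketch of such a split, purely for illustration (the helper names prepare_data and setup_trainer do not exist in the package):

```python
# illustrative sketch only -- helper names (prepare_data, setup_trainer)
# are not existing utilities in pytorch-forecasting
from lightning.pytorch import Trainer
from pytorch_forecasting import TimeSeriesDataSet


def prepare_data(data_with_covariates, **dataset_kwargs):
    """Build train/val datasets and dataloaders from raw data."""
    training = TimeSeriesDataSet(data_with_covariates, **dataset_kwargs)
    validation = TimeSeriesDataSet.from_dataset(training, data_with_covariates)
    return (
        training.to_dataloader(train=True, batch_size=4),
        validation.to_dataloader(train=False, batch_size=4),
    )


def setup_trainer(tmp_path, **trainer_kwargs):
    """Configure a Trainer with fast, test-friendly defaults."""
    return Trainer(
        max_epochs=1,
        limit_train_batches=2,
        default_root_dir=tmp_path,
        **trainer_kwargs,
    )


def _integration(model_cls, data_with_covariates, tmp_path, **kwargs):
    """Fit one model end-to-end, delegating to the smaller helpers."""
    train_dl, val_dl = prepare_data(
        data_with_covariates, **kwargs.pop("dataset_kwargs", {})
    )
    trainer = setup_trainer(tmp_path, **kwargs.pop("trainer_kwargs", {}))
    model = model_cls.from_dataset(train_dl.dataset, **kwargs)
    trainer.fit(model, train_dataloaders=train_dl, val_dataloaders=val_dl)
    return trainer
```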

@fkiraly (Collaborator, Author) replied:

Agreed! I was thinking the same thing.

My suggestion would be to sequence this, though, since there are multiple copy-paste variations across files.

So:

  1. first, refactor the current _integration variants into a single file and loop, parametrizing the variable parts
  2. then, refactor into modular parts

Not sure if this is the best way - if we get stuck, we may want to try the other way round, since 2 might make 1 easier.
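A minimal sketch of what step 1 could look like, with a hypothetical scenario table holding only the variable parts (assumes the shared _integration from this PR and the data_with_covariates fixture from conftest):

```python
import pytest

import pytorch_forecasting

# hypothetical scenario table -- only the variable parts of the
# copy-paste variants are listed per model
MODEL_SCENARIOS = [
    ("TemporalFusionTransformer", {"hidden_size": 8}),
    ("DeepAR", {"hidden_size": 8}),
]


@pytest.mark.parametrize("model_name, model_kwargs", MODEL_SCENARIOS)
def test_model_integration(model_name, model_kwargs, data_with_covariates, tmp_path):
    """One shared loop replacing the per-model _integration copies."""
    model_cls = getattr(pytorch_forecasting, model_name)
    _integration(model_cls, data_with_covariates, tmp_path, **model_kwargs)
```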

```python
result = []
ROOT = str(Path(__file__).parent.parent)  # package root directory


def _coerce_to_str(obj):
```
A member commented:

Should we add these "coerce" functions to utils._coerce, just to make them usable across the whole module?

@fkiraly (Collaborator, Author) replied:

Hm, I thought about it and would say "no", because it assumes that the presence of get_tag means we have a BaseObject descendant instance.

utils._coerce makes no specific assumptions about inheritance.

I wonder how we could resolve this - maybe we move the get_tag logic out?
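One possible resolution, sketched with illustrative names: keep a generic coercion with no inheritance assumptions (which could then live in utils._coerce), plus a thin tag-aware wrapper on top:

```python
def _coerce_to_str(obj):
    """Generic coercion; makes no assumptions about the object's base class."""
    if isinstance(obj, str):
        return obj
    if isinstance(obj, (list, tuple)):
        return [_coerce_to_str(x) for x in obj]
    return type(obj).__name__


def _coerce_tagged_to_str(obj):
    """Tag-aware variant: resolve skbase BaseObject descendants via get_tag."""
    if hasattr(obj, "get_tag"):  # duck-typed check, avoids an isinstance import
        obj = obj.get_tag("object_type", raise_error=False) or obj
    return _coerce_to_str(obj)
```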

@phoeenniixx (Member) commented May 19, 2025

I have a few comments:

  • What will be the difference between class _BaseObject and _BasePtForecaster? I mean, can _BasePtForecaster not inherit directly from _SkbaseBaseObject? Why do we need _BaseObject?
  • The func _integration is mainly for v1; I think we should write a similar func for v2 that also checks the metadata transfer, etc. The basis can be the tests written for TFT in [ENH] EXPERIMENTAL: TFT model based on the new data pipeline #1812 (see test_model_with_datamodule_integration).
  • I assume the current fixture generator and _BasePtForecaster are already compatible with both versions, right? Or am I missing something?
  • Do we need to add tests for D1-D2 integration as well, as suggested by @PranavBhatP (some of it can be seen in [ENH] EXPERIMENTAL PR: D1 and D2 layer for v2 refactor #1811)?
  • To make this compatible with v2, we just need to make some changes to data_scenarios and _conftest; other than that, everything is independent of the version, right?

I think this PR can be the basis of the tests for v2 that I'll be working on.

@phoeenniixx (Member) commented:

One more doubt: should we add some initialisation tests for the models as well, or are the integration tests enough?

@fkiraly (Collaborator, Author) commented May 20, 2025

> What will be the difference between class _BaseObject and _BasePtForecaster? I mean, can _BasePtForecaster not inherit directly from _SkbaseBaseObject? Why do we need _BaseObject?

Good question - I thought we may later also attach metadata to layers, in which case the layers would also inherit from _BaseObject, forming the lowest common base object of everything in pytorch-forecasting. Which, I thought, would be nice to have inside the package, from an architectural perspective ("single point of dependency").

> The func _integration is mainly for v1; I think we should write a similar func for v2 that also checks the metadata transfer, etc. The basis can be the tests written for TFT in #1812 (see test_model_with_datamodule_integration).

Agreed, although I think there should also be smaller unit tests. Plus, we need to abstract out the variable parts, possibly in more "test parameter getter" methods in the metadata class.

> I assume the current fixture generator and _BasePtForecaster are already compatible with both versions, right? Or am I missing something?

The fixture generators are generic and could be used (by inheritance) for any base class defined by the object_type tag. They retrieve all classes with the specified object_type tag and run them through the tests, in a loop.
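For concreteness, a sketch of the retrieval step, assuming all_objects from skbase.lookup and an illustrative tag value:

```python
from skbase.lookup import all_objects


def _all_forecasters():
    """Retrieve all classes in the package carrying the given object_type tag."""
    return all_objects(
        package_name="pytorch_forecasting",
        filter_tags={"object_type": "forecaster_pytorch"},  # illustrative value
        return_names=False,
    )
```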

> Do we need to add tests for D1-D2 integration as well, as suggested by @PranavBhatP (some of it can be seen in #1811)?

I think that is a good idea. For v2, I would advise adding a get_datamodule function, so that D2 and the model can be tested together. For this, it would be good to have at least one D2 instance which is not encoder-decoder, to see whether the loop works properly.
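A sketch of what the suggested get_datamodule hook could look like; the data module class name and import path are assumptions, not the final API:

```python
class _TFTv2Metadata:  # hypothetical metadata class for one v2 model
    """Metadata record pairing a v2 model with its D2 data module."""

    @classmethod
    def get_datamodule(cls, data, **kwargs):
        """Return a D2 data module instance suitable for this model."""
        # class name and location assumed, following the v2 data pipeline PRs
        from pytorch_forecasting.data import EncoderDecoderTimeSeriesDataModule

        return EncoderDecoderTimeSeriesDataModule(data, **kwargs)
```

The test loop could then call metadata_cls.get_datamodule(data) and hand the result to trainer.fit(model, datamodule=dm), regardless of which D2 flavour the model needs.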

> To make this compatible with v2, we just need to make some changes to data_scenarios and _conftest; other than that, everything is independent of the version, right?

More things need to change:

  • the tests should work with v2, of course, with all the changes that entails
  • you will also need to take into account potentially varying D2, see above

> I think this PR can be the basis of the tests for v2 that I'll be working on.

Yes, I would advise branching off in a way that does not overwrite the v1 tests - which we will also need.

One way to filter could be to use an additional object_type tag value, "forecaster_pytorch_v2".
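As a sketch, a v2 metadata class would then declare the new tag value (class name hypothetical, import path assumed):

```python
# import path assumed for illustration; adjust to the actual location
from pytorch_forecasting.models.base._base_object import _BasePtForecaster


class _MyV2ForecasterPkg(_BasePtForecaster):  # hypothetical metadata class
    """Selected separately from the v1 forecasters via its object_type tag."""

    _tags = {"object_type": "forecaster_pytorch_v2"}
```

The fixture generator can then pass filter_tags={"object_type": "forecaster_pytorch_v2"} to collect v2 forecasters only, leaving the v1 test loop untouched.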

@fkiraly (Collaborator, Author) commented May 20, 2025

> Should we add some initialisation tests for the models as well, or are the integration tests enough?

Yes, I think there should be init tests and unit tests as well. This PR just aimed at refactoring all the current tests - that is perhaps a good start, and we may like to complete it. For the v2 tests, though, which will be written from scratch, I would strongly advise adding init and unit tests!
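For illustration, an init test could be as small as the following, using the create_test_instance convention from skbase (object_class is assumed to be supplied by the fixture generator):

```python
def test_init(object_class):
    """Check that the model can be constructed from its test parameters."""
    model = object_class.create_test_instance()
    assert isinstance(model, object_class)
    # cheap extra sanity check: get_params / set_params round-trip
    params = model.get_params()
    assert model.set_params(**params) is model
```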

@fkiraly fkiraly marked this pull request as ready for review May 27, 2025 20:41
@fkiraly fkiraly merged commit aa3bc9d into main May 28, 2025
34 of 51 checks passed
@github-project-automation github-project-automation bot moved this from PR in progress to Done in May - Sep 2025 mentee projects May 28, 2025
fkiraly pushed a commit that referenced this pull request May 29, 2025
### Description

This PR fixes #1807 and stacks upon PR #1780. It builds upon PR #1814 (closed due to a complex commit history).