Conversation

@AVHopp
Collaborator

@AVHopp AVHopp commented May 7, 2025

Replaces the custom IndexKernel construction with BoTorch's MultiTaskGP (which became possible due to the added all_tasks argument).
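
For readers unfamiliar with the construction mentioned above, the following is a minimal plain-Python sketch of what an index (task) kernel does conceptually: it looks up a covariance between task indices in a matrix of the form B = W Wᵀ + diag(v) and multiplies it with a kernel over the remaining inputs. This is not the code from this PR and does not call BoTorch or GPyTorch; the function names, the RBF base kernel, and all numbers are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: a plain-Python version of an index (task) kernel
# combined with an RBF kernel over the inputs. All names and values here are
# hypothetical and are not taken from the PR, BoTorch, or GPyTorch.
import math


def task_covariance(W, v):
    """Return B = W W^T + diag(v), a positive semi-definite task covariance."""
    n = len(W)
    B = [[sum(a * b for a, b in zip(W[i], W[j])) for j in range(n)] for i in range(n)]
    for i in range(n):
        B[i][i] += v[i]
    return B


def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on the non-task features."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-0.5 * sq_dist / lengthscale**2)


def multitask_kernel(p1, p2, B):
    """Evaluate k((x, i), (x', j)) = B[i][j] * k_rbf(x, x')."""
    (x1, t1), (x2, t2) = p1, p2
    return B[t1][t2] * rbf(x1, x2)


# Two tasks (0 = source, 1 = target) with a rank-1 factor W and diagonal term v.
B = task_covariance(W=[[1.0], [0.8]], v=[0.1, 0.1])

# Same input point, evaluated within the target task and across tasks.
same_task = multitask_kernel(([0.2, 0.4], 1), ([0.2, 0.4], 1), B)
cross_task = multitask_kernel(([0.2, 0.4], 0), ([0.2, 0.4], 1), B)
```

The off-diagonal entry of B controls how much information is shared between the source and target task; a transfer-learning surrogate learns these entries from data instead of fixing them as done in this sketch.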

@AVHopp AVHopp marked this pull request as draft May 7, 2025 07:40
@AVHopp AVHopp changed the title from "Tl benchmarking investigation" to "Use Botorch MultiTaskGP for transfer learning" May 7, 2025
@Hrovatin Hrovatin force-pushed the tl_benchmarking_investigation branch 2 times, most recently from 8fee382 to 88e1dfe Compare June 4, 2025 11:18
@Hrovatin Hrovatin marked this pull request as ready for review June 5, 2025 10:39
Collaborator

@AdrianSosic AdrianSosic left a comment

Hi @Hrovatin, here is the first batch of comments.

@Hrovatin Hrovatin requested a review from AdrianSosic June 6, 2025 14:40
@Scienfitz
Collaborator

Scienfitz commented Aug 15, 2025

@Hrovatin would you consider abandoning this PR? I think if this topic is picked up again, it's better to start afresh (and only open a PR after investigations have concluded).

@Hrovatin
Collaborator

@Scienfitz I would keep this open, as the main blocker was randomness in the benchmarks. Since that may be solved now, I would suggest running the benchmarks again on the new HPC (we need to confirm that they are also reproducible there).

@Scienfitz
Collaborator

@Hrovatin any update?

@Hrovatin
Collaborator

Hrovatin commented Sep 9, 2025

No, I first need to set up testing on oneHPC to benchmark reproducibly, as that seems to be the only option for making the benchmarks fully reproducible. I will post an update here once I have the results @Scienfitz

Copilot AI review requested due to automatic review settings September 12, 2025 11:22
@Hrovatin Hrovatin force-pushed the tl_benchmarking_investigation branch from 8ce5fba to bee32aa Compare September 12, 2025 11:22
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@Hrovatin
Collaborator

Hrovatin commented Sep 17, 2025

@AdrianSosic @Scienfitz @AVHopp Update on the comparison between BoTorch's MultiTaskGP and the current kernel:

  • The results are not identical but very close, except for Michalewicz (and even there the variation is likely not significant).
  • A concern: when using the BoTorch MultiTaskGP, the Hartmann TL benchmark always fails due to ooo (when using 0.05 but not 0.01 source data). I have not yet figured out why. Before investigating this, we should probably decide whether we are OK with accepting some deviation from current main (named benchmarks-reproducibility-beforeBug on the plot). If we decide that we need 100% reproducibility anyway, it does not make sense to investigate any other issues further.
image
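
One way to make the "close but not identical" judgment above explicit is to agree on a tolerance and check benchmark curves against it. The sketch below is a hypothetical helper, not part of this PR or of the benchmarking code; the function names, the choice of a maximum relative deviation, and the example numbers are all assumptions for illustration and are not read off the plot.

```python
# Hypothetical helper for comparing two benchmark curves; the names and the
# example values are made up and do not come from the PR or its plots.


def max_relative_deviation(reference, candidate, eps=1e-12):
    """Largest pointwise relative deviation of candidate from reference."""
    if len(reference) != len(candidate):
        raise ValueError("Curves must have the same number of points.")
    return max(
        abs(c - r) / max(abs(r), eps) for r, c in zip(reference, candidate)
    )


def within_tolerance(reference, candidate, rel_tol):
    """True if every point of candidate deviates by at most rel_tol."""
    return max_relative_deviation(reference, candidate) <= rel_tol


# Made-up curves standing in for a result on main and on the PR branch.
main_curve = [1.00, 0.80, 0.65, 0.50]
branch_curve = [1.00, 0.81, 0.64, 0.50]

deviation = max_relative_deviation(main_curve, branch_curve)
accept_loose = within_tolerance(main_curve, branch_curve, rel_tol=0.02)
accept_strict = within_tolerance(main_curve, branch_curve, rel_tol=0.01)
```

With a helper of this kind, the decision discussed above reduces to choosing `rel_tol`: a value of zero corresponds to requiring 100% reproducibility, while a small positive value accepts some deviation from main.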

Collaborator Author

@AVHopp AVHopp left a comment

First round of comments, but we should discuss some of the points (in particular the one regarding multiple active values) internally first.

@Hrovatin Hrovatin force-pushed the tl_benchmarking_investigation branch from de81707 to 68a9c24 Compare September 25, 2025 07:13
Collaborator Author

@AVHopp AVHopp left a comment

Would be willing to approve - however, since this is technically my PR I can't

@Hrovatin
Collaborator

Hrovatin commented Oct 2, 2025

Results after rebase:
Note:

  • Hartmann TL did not run for the new branch due to issues in the local setup (the plot shows only the main branch), but I tested that it runs successfully in actions.
  • Reproducibility is in general not 100% (also when not using the TL code that was changed).
image

@AdrianSosic AdrianSosic force-pushed the tl_benchmarking_investigation branch 4 times, most recently from 5cfb366 to 7bb49d9 Compare October 6, 2025 08:50
@Hrovatin
Collaborator

Hrovatin commented Nov 4, 2025

Note to myself: Have a look at meta-pytorch/botorch#2739 (comment)

Hrovatin and others added 25 commits November 10, 2025 10:45
@AdrianSosic AdrianSosic force-pushed the tl_benchmarking_investigation branch from 7a91412 to 9dc7606 Compare November 10, 2025 10:20
The active_dims argument can now be dropped due to #671
@AdrianSosic AdrianSosic force-pushed the tl_benchmarking_investigation branch from 9dc7606 to 8db6a0a Compare November 10, 2025 10:28
5 participants