adding t-edm #834
base: main
Conversation
I left a few comments and questions, mostly about:
- More thorough docstrings, including tensor shapes and the expected signatures of callable arguments. There are now many samplers and preconditioners with different signatures, and it is becoming genuinely challenging to use them correctly.
- Test coverage for the new features.
I have two major comments:
- This PR introduces `tEDMPrecond` and `tEDMLoss`, but never actually uses them in `train.py`. I want to make sure that this is not an omission and that it is really intentional.
- I understand that for generation, `tEDMPrecond` is only expected to work in combination with the `deterministic_sampler`. However, I believe the `deterministic_sampler` suffers from multiple bugs (mostly argument mismatches), so I doubt that the combination `tEDMPrecond` + `deterministic_sampler` is currently usable as it is.
IMO these parameters should go in the conf/sampler/ config files rather than in the generation config.
@@ -111,6 +111,15 @@ def main(cfg: DictConfig) -> None:
    else:
        logger0.info("Patch-based training disabled")

    # Parse the t-distribution parameters
These checks are unnecessary, as they are already handled by the default kwargs of `diffusion_step`. Can we make sure that we either:
- keep the APIs robust so they can handle default values, or
- are more explicit and enforce that default values are always specified in the config files?
(A rough sketch of the first option follows.)
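For illustration only, a minimal sketch of the first option: forward just the values that are explicitly set in the config and let `diffusion_step`'s own default kwargs handle the rest. The config keys ("use_t_latents" and "nu" under cfg.sampler) and the helper name are assumptions, not code from this PR.

# Sketch only (not the PR's code): rely on diffusion_step's defaults instead
# of re-validating them here; only forward options that are actually set.
from omegaconf import DictConfig


def t_dist_kwargs(cfg: DictConfig) -> dict:
    """Collect t-distribution options that are explicitly present in the config."""
    kwargs = {}
    for key in ("use_t_latents", "nu"):
        value = cfg.sampler.get(key)  # returns None when the key is absent
        if value is not None:
            kwargs[key] = value
    return kwargs


# Usage: diffusion_step(..., **t_dist_kwargs(cfg))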
@@ -164,6 +173,10 @@ def main(cfg: DictConfig) -> None:
            solver=cfg.sampler.solver,
        )
    elif cfg.sampler.type == "stochastic":
        if use_t_latents:
If I understand correctly, we don't support t-EDM + patch-based generation for now. It might be helpful to raise an error to avoid this combination, as sketched below.
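For example (a sketch only; the flag name for patch-based generation is a placeholder, since the actual config key is not visible in this diff):

def check_t_edm_compatibility(use_t_latents: bool, patching_enabled: bool) -> None:
    """Fail fast when t-EDM latents are requested together with patch-based generation."""
    if use_t_latents and patching_enabled:
        raise NotImplementedError(
            "t-EDM (use_t_latents=True) is not supported with patch-based "
            "generation yet; disable one of the two options."
        )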
@@ -79,6 +79,8 @@ def diffusion_step( # TODO generalize the module and add defaults
    device: torch.device,
    hr_mean: torch.Tensor = None,
    lead_time_label: torch.Tensor = None,
    use_t_latents: bool = False,
Can we add a test that exercises `diffusion_step` with `use_t_latents=True`? A rough sketch follows.
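Something along these lines (a sketch, not a working test: only `device`, `use_t_latents`, and `nu` come from this diff; the `net` argument name, the mock net, the import path, and the remaining required arguments are placeholders that need to be adapted to the actual `diffusion_step` signature):

import pytest
import torch


class _IdentityNet(torch.nn.Module):
    """Minimal stand-in for the denoiser passed to diffusion_step."""

    def forward(self, x, *args, **kwargs):
        return x


@pytest.mark.parametrize("use_t_latents,nu", [(False, 0), (True, 10)])
def test_diffusion_step_t_latents(use_t_latents, nu):
    from generate import diffusion_step  # placeholder import path

    out = diffusion_step(
        net=_IdentityNet(),
        device=torch.device("cpu"),
        use_t_latents=use_t_latents,
        nu=nu,
        # ... remaining required arguments of diffusion_step ...
    )
    assert torch.isfinite(out).all()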
@@ -79,6 +79,8 @@ def diffusion_step( # TODO generalize the module and add defaults
    device: torch.device,
    hr_mean: torch.Tensor = None,
    lead_time_label: torch.Tensor = None,
    use_t_latents: bool = False,
    nu: int = 0,
These two options are not documented in the docstring; see the sketch below for possible entries.
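For example, entries along these lines (the wording and the described semantics are my suggestion, to be checked against the implementation):

    use_t_latents : bool, optional
        If True, draw the initial latents from a Student-t distribution
        (t-EDM) instead of a Gaussian. Default is False.
    nu : int, optional
        Degrees of freedom of the Student-t distribution; only relevant
        when ``use_t_latents`` is True. Default is 0.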
    Parameters
    ----------
    P_mean: float, optional
I might be missing something, but this t-EDM loss is NOT supposed to be used with CorrDiff, right? The fact that this loss is introduced but never actually used does not help comprehension, but if it is indeed supposed to be used for CorrDiff training, there should be a regression model somewhere in `tEDMLoss.__init__`, as is the case for the classical CorrDiff `ResLoss`. A rough sketch of what that could look like is below.
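Purely to illustrate the point, a very rough sketch of a ResLoss-style constructor (parameter names other than P_mean are not from this PR, and the defaults are placeholders):

import torch


class tEDMLoss:
    """Illustrative skeleton only, not the PR's implementation."""

    def __init__(
        self,
        regression_net: torch.nn.Module,
        P_mean: float = 0.0,
        nu: int = 10,
    ):
        # Keep the pre-trained regression model around so the diffusion loss
        # can be computed on the residual, as the classical ResLoss does.
        self.regression_net = regression_net
        self.P_mean = P_mean
        self.nu = nu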
    net: torch.nn.Module
        The neural network model that will make predictions.

    images: torch.Tensor
Can we add the shapes of the input/output tensors? I found it a little counter-intuitive that we don't actually output a loss but a pixelwise squared difference (so it needs a reduction by `mean` or `sum` afterwards).
Ideally, the docstring should also include the expected signature of the `net` argument. I realized that the signature of `tEDMPrecond` is not the same as other CorrDiff preconditioners like `EDMPrecond`, so not specifying the signature will inevitably lead to confusion. A possible docstring fragment is sketched below.
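For illustration, a docstring fragment with the level of detail I have in mind (the exact shapes, parameter names, and the `net` call signature below are assumptions to be verified against the code):

    Parameters
    ----------
    net : torch.nn.Module
        Denoiser to train. Note that its expected call signature differs
        from EDMPrecond; something like
        net(x, img_lr, sigma, class_labels=None, **model_kwargs).
    images : torch.Tensor
        High-resolution target images of shape (B, C_hr, H, W).
    img_lr : torch.Tensor
        Low-resolution conditioning images of shape (B, C_lr, H, W).

    Returns
    -------
    torch.Tensor
        Pixelwise weighted squared error of shape (B, C_hr, H, W). The
        result is NOT reduced; apply .mean() or .sum() to get a scalar loss.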
    auto_grad: bool = False


class tEDMPrecond(Module):
Can we add a test for `tEDMPrecond`? A rough shape test is sketched below.
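For example (the import path, the constructor arguments other than those visible in this diff, and the forward call signature are guesses that need adjusting):

import torch


def test_tedm_precond_forward_shape():
    # Placeholder import path; adjust to wherever tEDMPrecond lives in the PR.
    from physicsnemo.models.diffusion import tEDMPrecond

    net = tEDMPrecond(
        img_resolution=16,  # assumed constructor arguments
        img_channels=3,
        model_type="SongUNet",
    )
    x = torch.randn(2, 3, 16, 16)
    sigma = torch.ones(2)
    # Call signature assumed to follow the positional convention used in the
    # deterministic_sampler's else-branch: (x, condition, sigma, class_labels).
    out = net(x, x, sigma, None)
    assert out.shape == x.shape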
        sigma_min=0,
        sigma_max=float("inf"),
        sigma_data=0.5,
        model_type="DhariwalUNet",
Are we sure that we want `DhariwalUNet` to be the default architecture? I thought `SongUNet` would make more sense.
In any case, would it be possible to document the possible values for the `model_type` argument?
If for now we want to restrict it to `DhariwalUNet`, we should remove this option.
        **model_kwargs,
    )

    def forward(
Note that this `forward` will not work with the current `deterministic_sampler`, as the `condition` kwarg will be ignored. See this snippet from `deterministic_sampler.py`:
if isinstance(net, EDMPrecond):
# Conditioning info is passed as keyword arg
denoised = net(
x_hat / s(t_hat),
sigma(t_hat),
condition=x_lr,
class_labels=class_labels,
).to(torch.float64)
else:
denoised = net(x_hat / s(t_hat), x_lr, sigma(t_hat), class_labels).to(
torch.float64
)
Note: when working on other CorrDiff things I realized the `deterministic_sampler` is mostly broken. I am not even sure there is a single use case where it works properly. I did not address these bugs because my understanding is that the `deterministic_sampler` is almost never used.
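If it helps, one possible direction is to dispatch on the denoiser's actual forward signature rather than on isinstance, so new preconditioners are not silently routed to the wrong branch. This is only a sketch of the idea (it reuses the variables from the sampler snippet above and is not a tested patch):

import inspect

# Pass the conditioning as a keyword only if the denoiser's forward actually
# accepts a "condition" parameter; otherwise fall back to the positional call.
accepts_condition = "condition" in inspect.signature(net.forward).parameters
if accepts_condition:
    denoised = net(
        x_hat / s(t_hat), sigma(t_hat), condition=x_lr, class_labels=class_labels
    ).to(torch.float64)
else:
    denoised = net(x_hat / s(t_hat), x_lr, sigma(t_hat), class_labels).to(
        torch.float64
    )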
PhysicsNeMo Pull Request
Description
Checklist
Dependencies