
Fixes for multi-diffusion #1560

Merged
CharlelieLrt merged 9 commits into NVIDIA:main from CharlelieLrt:multi-diffusion-fixes
Apr 11, 2026

Conversation

@CharlelieLrt
Collaborator

PhysicsNeMo Pull Request

Description

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI's assessment of merge readiness; it is not a qualitative judgment of your work, nor
an indication that the PR will be accepted or rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment; they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR fixes several device-compatibility and DDP/torch.compile compatibility issues in the multi-diffusion training path. Key changes:

  1. reset_patch_indices and _CompiledPatchX now operate on the unwrapped MultiDiffusionModel2D, peeling off DDP/compiled wrappers via the new _unwrap_multi_diffusion helper.
  2. _compute_global_index propagates the device from patch_indices to avoid cross-device errors.
  3. random.randint is replaced by torch.randint for GPU-native patch sampling with optional generator support.
  4. RandomPatching2D.forward is rewritten from unfold + advanced indexing to a torch.gather-based approach that is more compatible with torch.compile.

All remaining findings are P2 suggestions.

Important Files Changed

Filename — Overview

physicsnemo/diffusion/multi_diffusion/losses.py — Adds the _unwrap_multi_diffusion helper to peel DDP/compiled wrappers, and uses the unwrapped model for reset_patch_indices and _CompiledPatchX; constructor type annotations still declare MultiDiffusionModel2D, but the logic now accepts wrapped variants.

physicsnemo/diffusion/multi_diffusion/patching.py — Replaces random.randint with torch.randint (GPU-compatible, supports a PRNG generator), propagates the device from patch_indices in _compute_global_index, defers _global_index recomputation lazily via a plain Python flag, and rewrites forward from unfold-based to torch.gather-based indexing.

physicsnemo/diffusion/utils/model_wrappers.py — Trivial: adds a blank line after the license header. No functional change.
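The two patching changes (device-aware torch.randint sampling and gather-based extraction) can be sketched as follows. The function and parameter names here are assumptions for illustration, not the actual RandomPatching2D API.

```python
import torch


def sample_patch_origins(batch, h, w, patch, device, generator=None):
    # GPU-native replacement for random.randint: draws per-sample top-left
    # corners directly on the target device, with optional PRNG generator
    top = torch.randint(0, h - patch + 1, (batch,), device=device, generator=generator)
    left = torch.randint(0, w - patch + 1, (batch,), device=device, generator=generator)
    return top, left


def extract_patches(x, top, left, patch):
    # x: (B, C, H, W) -> (B, C, patch, patch) via torch.gather, which tends
    # to trace more cleanly under torch.compile than unfold + advanced indexing
    B, C, H, W = x.shape
    rows = top[:, None] + torch.arange(patch, device=x.device)   # (B, patch)
    cols = left[:, None] + torch.arange(patch, device=x.device)  # (B, patch)
    # gather the patch rows: index shape (B, C, patch, W)
    row_idx = rows[:, None, :, None].expand(B, C, patch, W)
    x = torch.gather(x, 2, row_idx)
    # gather the patch columns: index shape (B, C, patch, patch)
    col_idx = cols[:, None, None, :].expand(B, C, patch, patch)
    return torch.gather(x, 3, col_idx)
```

Because both the random indices and the gather indices live on the same device as the input, no host-to-device synchronization is forced during patch sampling.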

Comments Outside Diff (2)

  1. physicsnemo/diffusion/multi_diffusion/losses.py, line 228-229 (link)

    P2 Type annotation doesn't match new behaviour

    The model parameter is still typed as MultiDiffusionModel2D, but _unwrap_multi_diffusion was introduced precisely because this constructor should now also accept DistributedDataParallel and torch.compile-wrapped models. Passing a DDP-wrapped model currently satisfies the runtime logic but violates the declared type, which will mislead type checkers and users reading the docstring.

    The same applies to the docstring Parameters section (model : MultiDiffusionModel2D) — updating it to model : torch.nn.Module or model : MultiDiffusionModel2D | torch.nn.Module would keep the contract accurate.

  2. physicsnemo/diffusion/multi_diffusion/losses.py, line 451-452 (link)

    P2 Type annotation doesn't match new behaviour

    Same annotation mismatch as in MultiDiffusionMSEDSMLoss: the parameter is typed MultiDiffusionModel2D but the class now unwraps DDP/compiled wrappers. Consider updating to torch.nn.Module.
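The annotation fix both comments suggest would look roughly like this. The class body is a hypothetical skeleton (the real constructors do more); only the widened `torch.nn.Module` annotation is the point.

```python
import torch


class MultiDiffusionMSEDSMLoss:
    """Hypothetical skeleton illustrating the suggested annotation change."""

    def __init__(self, model: torch.nn.Module) -> None:
        # Typed as torch.nn.Module so that DDP-wrapped and torch.compile-wrapped
        # models satisfy the declared contract; the real code would unwrap here.
        self.model = model
```

With `torch.nn.Module` as the declared type, passing a DistributedDataParallel or compiled wrapper no longer violates the signature that type checkers and readers see.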

Reviews (1): Last reviewed commit: "Fixes for multi-diffusion"

Collaborator

@pzharrington left a comment


LGTM!

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
@CharlelieLrt enabled auto-merge April 10, 2026 05:47
@CharlelieLrt
Collaborator Author

/blossom-ci

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
@CharlelieLrt
Collaborator Author

/blossom-ci

@CharlelieLrt
Collaborator Author

/blossom-ci

@CharlelieLrt
Collaborator Author

/blosson-ci

@CharlelieLrt
Collaborator Author

/blossom-ci

Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
@CharlelieLrt
Collaborator Author

/blossom-ci

@CharlelieLrt added this pull request to the merge queue Apr 11, 2026
Merged via the queue into NVIDIA:main with commit a102e53 Apr 11, 2026
4 checks passed
