Derivative functionals#1545

Open
loliverhennigh wants to merge 13 commits into NVIDIA:main from loliverhennigh:pr4-derivative-functionals

Conversation

@loliverhennigh
Collaborator

PhysicsNeMo Pull Request

Description

Various derivative functionals

@greptile-apps
Contributor

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR introduces seven derivative functionals (uniform_grid_gradient, rectilinear_grid_gradient, mesh_lsq_gradient, mesh_green_gauss_gradient, spectral_grid_gradient, and two meshless FD ops) under physicsnemo/nn/functional/derivatives, each with PyTorch and optional Warp backends registered via FunctionSpec. The implementations, benchmark cases, and tests are thorough, but two backend-inconsistency issues in the unstructured mesh functionals need attention before merging.

  • P1 (mesh_green_gauss_gradient torch backend, _torch_impl.py line 208): the torch path silently propagates partial/incorrect gradients w.r.t. points through the geometry coefficients, while the warp path correctly raises ValueError("warp … does not support gradients w.r.t. points"). Switching backends changes numerical outputs without any error signal.
  • P1 (mesh_lsq_gradient torch backend, _torch_impl.py line 1133): same pattern—dx = neigh_points - center traces back to points, giving a partial gradient that is inconsistent with the warp backend's explicit return None for grad_points.
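
Both P1s share the same fix shape: make the torch path fail fast the way the warp path already does. A minimal sketch (the helper name and error message below are illustrative, not the PR's actual code):

```python
import torch


def check_points_grad(points: torch.Tensor, op_name: str) -> None:
    """Illustrative guard: mirror the warp backend's hard failure in the torch path."""
    if points.requires_grad:
        raise ValueError(
            f"{op_name} does not support gradients w.r.t. points; "
            "call points.detach() before passing them in"
        )
```

Calling such a guard at the top of both torch implementations would make switching backends change performance characteristics, not numerics.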

Important Files Changed

| Filename | Overview |
| --- | --- |
| `physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/_torch_impl.py` | Torch implementation of the Green-Gauss gradient; silently allows partial/incorrect gradients w.r.t. points, inconsistent with the warp backend, which raises ValueError. |
| `physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/utils.py` | Builds Green-Gauss geometry tensors; correct algorithm, but creates O(n_cells×n_faces) per-element GPU tensor allocations inside a nested Python loop, a significant performance issue at scale. |
| `physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/_warp_impl.py` | Warp kernel implementation with correct forward/backward Green-Gauss kernels and autograd bridge; correctly raises ValueError for points.requires_grad. |
| `physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/mesh_green_gauss_gradient.py` | FunctionSpec class with warp/torch backend registration, benchmark cases, and parity tolerances; well structured. |
| `physicsnemo/nn/functional/derivatives/mesh_lsq_gradient/_torch_impl.py` | Torch batched-lstsq LSQ gradient implementation; allows gradient flow w.r.t. points through dx, inconsistent with the warp backend, which explicitly returns None for grad_points. |
| `physicsnemo/nn/functional/derivatives/mesh_lsq_gradient/_warp_impl.py` | Warp kernels for 1D/2D/3D LSQ gradient with analytical 2×2/3×3 inverses in-kernel and matching backward kernels; returns None for grad_points. |
| `physicsnemo/nn/functional/derivatives/meshless_finite_difference/meshless_finite_difference.py` | MeshlessFDStencilPoints and MeshlessFDDerivatives FunctionSpec classes; uses jaxtyping annotations inconsistently with other functionals in this PR. |
| `physicsnemo/nn/functional/derivatives/meshless_finite_difference/_torch_impl.py` | Torch stencil-point construction and central finite-difference derivatives; clean, correct implementation with good validation. |
| `physicsnemo/nn/functional/derivatives/spectral_grid_gradient/spectral_grid_gradient.py` | SpectralGridGradient FunctionSpec class; uses jaxtyping annotations inconsistently with sibling functionals; otherwise well structured. |
| `physicsnemo/nn/functional/derivatives/spectral_grid_gradient/_torch_impl.py` | Spectral differentiation via FFT with correct wavenumber construction and mixed-derivative support; handles float16 upcast and the Nyquist frequency correctly. |
| `physicsnemo/nn/functional/derivatives/uniform_grid_gradient/uniform_grid_gradient.py` | UniformGridGradient with a 3-tier auto-dispatch heuristic (warp/torch_compiled/torch) and lru_cache for the compiled variant; well designed. |
| `physicsnemo/nn/functional/derivatives/rectilinear_grid_gradient/rectilinear_grid_gradient.py` | RectilinearGridGradient FunctionSpec with warp/torch backends for nonuniform periodic grids; correct nonuniform central-difference scheme documented with LaTeX. |
| `physicsnemo/nn/functional/derivatives/__init__.py` | Package init exporting all derivative functionals and FunctionSpec classes; complete and consistent. |
| `test/nn/functional/derivatives/test_mesh_green_gauss_gradient.py` | Tests torch accuracy on a linear field, warp/torch parity, and backward gradient parity; good coverage of the main paths. |
| `benchmarks/physicsnemo/nn/functional/registry.py` | Adds all seven new derivative functionals to the ASV benchmark registry; correct and complete. |

Comments Outside Diff (3)

  1. physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/_torch_impl.py, lines 208-216

    P1 Inconsistent points.requires_grad handling vs. warp backend

    The torch implementation silently allows—and partially computes—gradients w.r.t. points, because coeff is derived from points via build_geometry and the autograd graph is never severed. The warp backend explicitly raises a ValueError for this case ("warp mesh_green_gauss_gradient does not support gradients w.r.t. points"). When a user switches between backends, they will get either an incorrect partial gradient (torch) or a hard failure (warp) for the same input, with no warning in the torch path.

    The gradient through coeff is incomplete: it captures the sensitivity of the face-area coefficients to point positions, but not the topological sensitivity (which neighbors a point has), making the gradient numerically incorrect for mesh-optimization use cases.

  2. physicsnemo/nn/functional/derivatives/mesh_lsq_gradient/_torch_impl.py, lines 1133-1148

    P1 Inconsistent gradient behavior for points vs. warp backend

    dx = neigh_points - center traces back to points, so autograd will compute a gradient w.r.t. points through the lstsq solution. The warp backend (_MeshLSQGradientWarpAutograd.backward) returns None for the points gradient, meaning the two backends disagree on this fundamental behavior. A training loop that runs on CPU (torch) will silently produce a gradient for points; the same loop on GPU (warp) will produce zero for the same input—no error, just different numbers.

    Consider guarding against points.requires_grad in the torch path, or at minimum document that gradients w.r.t. points are not supported and are silently zeroed in the warp path.

  3. physicsnemo/nn/functional/derivatives/mesh_green_gauss_gradient/utils.py, lines 930-945

    P2 Per-element device tensor allocation inside nested Python loop

    torch.tensor(verts, device=points.device, ...) is called once per face per cell inside a nested for cell_idx / for face_idx loop—O(n_cells × (dims+1)) tiny device allocations. On CUDA this triggers a device allocation and a host-to-device copy per iteration, making this extremely slow for even moderate meshes. The face_vertices tensor is pre-allocated; the inner assignment could be vectorised instead:

    ```python
    # Collect all (cell, face, vertex_idx) mappings in Python first
    all_verts = [
        [int(cell[v]) for v in local_face]
        for cell in cells_cpu
        for local_face in local_faces
    ]
    face_vertices = torch.tensor(
        all_verts, dtype=torch.int64, device=points.device
    ).reshape(n_cells, n_faces, dims)
    ```

    This reduces allocations to a single torch.tensor call regardless of mesh size.
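
As a quick sanity check of the single-allocation pattern, here is a self-contained toy run on a hypothetical two-triangle 2D mesh (variable names mirror the snippet above, but the data is made up and the `device=` argument is dropped so it runs on CPU):

```python
import torch

# Hypothetical toy 2D triangular mesh: 2 cells, 3 faces per cell,
# 2 vertices per face (an edge in 2D).
cells_cpu = [[0, 1, 2], [1, 3, 2]]
local_faces = [(0, 1), (1, 2), (2, 0)]  # local vertex pairs per face
n_cells, n_faces, dims = len(cells_cpu), len(local_faces), 2

# Build all (cell, face) -> global vertex mappings in pure Python,
# then materialize them with a single torch.tensor call.
all_verts = [
    [int(cell[v]) for v in local_face]
    for cell in cells_cpu
    for local_face in local_faces
]
face_vertices = torch.tensor(all_verts, dtype=torch.int64).reshape(
    n_cells, n_faces, dims
)
```

The resulting tensor has shape `(n_cells, n_faces, dims)` and is built with exactly one host-to-device transfer when a CUDA device is passed.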

Reviews (1): Last reviewed commit: "Add spectral and meshless finite-differe..."

Comment on lines +60 to +71
```python
@FunctionSpec.register(name="torch", rank=0, baseline=True)
def torch_forward(
    points: Float[Tensor, "num_points dim"],
    spacing: float | Sequence[float] = 1.0,
    include_center: bool = True,
) -> Float[Tensor, "num_points stencil_size dim"]:
    """Dispatch stencil-point construction to the PyTorch backend."""
    return meshless_fd_stencil_points_torch(
        points=points,
        spacing=spacing,
        include_center=bool(include_center),
    )
```
Contributor

P2 Inconsistent use of jaxtyping annotations across the PR

MeshlessFDStencilPoints, MeshlessFDDerivatives, and SpectralGridGradient use jaxtyping (Float[Tensor, "..."]) in their FunctionSpec.register-decorated methods, while MeshGreenGaussGradient, MeshLSQGradient, RectilinearGridGradient, and UniformGridGradient in the same PR use plain torch.Tensor. Consistent annotation style across all functionals in this module would improve maintainability.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

```diff
@@ -0,0 +1,19 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2026 NVIDIA CORPORATION & AFFILIATES.
```
Collaborator Author

This is currently unused in the mesh module; however, I already had a really nice implementation of it and wanted to add it.

```python
)


def meshless_fd_stencil_points_torch(
```
Collaborator Author

This is just used for generating the data for testing. Probably can remove honestly but ok...

```python
### Forward kernels (periodic central differences)
### ============================================================
```


Collaborator Author

I might break this file up...

Collaborator

@ktangsali left a comment

Looks great, thanks @loliverhennigh! I also tested the PhysicsInformer integration with this PR and we have everything needed for supporting that higher level abstraction as well. Thank you for addressing the mixed derivatives comment and getting it to feature parity with Sym's implementations.

I have added a few comments. Some comments (like the docstring one for functions) are applicable to all the different gradient methods but I have just added the comment to one of the functions / implementations. Kindly take a look.

Overall this is looking in great shape! Don't forget to update the Changelog.

```python
    spacing: float | Sequence[float] = 1.0,
    derivative_orders: int | Sequence[int] = 1,
    include_mixed: bool = False,
) -> Float[Tensor, "num_derivs num_points channels"]:
```
Collaborator

I think as Greptile also pointed out, there is some inconsistency in jaxtyping across the different gradient methods.

```python
        neighbor_indices=neighbor_indices,
        min_neighbors=int(min_neighbors),
    )
    if points.requires_grad:
```
Collaborator

From my understanding, torch will support this, right? Any specific reason why we cannot support this with Warp too?

```python
) -> torch.Tensor:
    """Compute Green-Gauss cell-centered gradients with Warp kernels."""
    validate_inputs(points=points, cells=cells, neighbors=neighbors, values=values)
    if points.requires_grad:
```
Collaborator

Same comment here.

```python
        include_mixed=bool(include_mixed),
    )

    orig_dtype = field.dtype
```
Collaborator

I believe this is a common pattern across the kernels - which makes sense to me, but maybe we add it as a note somewhere? Something like:

Note: Warp backends internally compute in float32. Float64 inputs are accepted but derivative accuracy is limited to float32 precision.

Because we would be losing precision if someone decides for some reason to pass fp64 (which folks might do to get more accurate grads).
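
A hedged sketch of the upcast/restore pattern under discussion (names and the stand-in computation below are illustrative; the actual Warp kernels differ):

```python
import torch


def run_in_float32(field: torch.Tensor) -> torch.Tensor:
    """Illustrative only: compute in float32 regardless of input dtype,
    then cast the result back. A float64 input round-trips through
    float32, so derivative accuracy is limited to fp32 precision."""
    orig_dtype = field.dtype
    out = field.to(torch.float32) * 2.0  # stand-in for the real kernel
    return out.to(orig_dtype)
```

The returned tensor has the caller's dtype, which is exactly why the precision loss is silent and worth a documented note.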

```python
_BENCHMARK_CASES = (
    ("2d-tri-24x24-scalar", 24, 24, False),
    ("2d-tri-36x36-scalar", 36, 36, False),
    ("2d-tri-36x36-vector", 36, 36, True),
```
Collaborator

Seems like we don't have a 3D test here? Would be good to test the 3D path too.

used: on CUDA, 1D/2D fields prefer ``torch``; 3D fields use a single-threshold
crossover (``torch`` -> ``warp``) as problem size grows. Inputs requiring
gradients prefer ``warp`` to use the explicit custom backward kernels.
Collaborator

This would be the public-facing API, right? Is my understanding correct? If so, can we add minimal examples to them? And maybe the parameters can be further expanded?

Examples
--------
>>> field = torch.sin(torch.linspace(0, 2*torch.pi, 64))
>>> grads = uniform_grid_gradient(field, spacing=2*torch.pi/64)
>>> grads.shape
torch.Size([1, 64])

Collaborator

Because I see that we are rendering these functions in the docs...
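
Relatedly, the 3-tier auto-dispatch heuristic quoted above could be sketched as follows. This is a simplified illustration with a made-up threshold, and it omits the torch_compiled tier; the real heuristic differs:

```python
def pick_backend(ndim: int, numel: int, on_cuda: bool, needs_grad: bool) -> str:
    """Illustrative dispatch sketch only: gradients prefer warp,
    large 3D CUDA problems cross over to warp, everything else uses torch."""
    crossover = 1_000_000  # hypothetical element-count threshold
    if needs_grad:
        return "warp"  # explicit custom backward kernels
    if on_cuda and ndim == 3 and numel >= crossover:
        return "warp"
    return "torch"
```

Something like this, inlined in the docstring, would make the dispatch behavior concrete for readers of the rendered docs.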

Collaborator

We should also add a CHANGELOG entry for this PR :)

@ktangsali ktangsali mentioned this pull request Apr 13, 2026