Skip empty parameters in gradient reduction #7789

tohtana · 2026-01-18T03:14:50Z

#7736 fixed an issue with OnebitLamb NaN propagation. With that fix, the optimizer correctly filters out empty parameters, but the DeepSpeed engine's gradient allreduce operation (which runs separately from the optimizer) still includes the gradients of empty parameters.

This PR addresses the issue by skipping empty parameters (numel=0) in _get_gradients_for_reduction().

Empty parameters (numel=0) cause issues in gradient allreduce when using flatten/unflatten operations. The unflatten operation fails with shape mismatches because empty tensors can't be properly reconstructed from a flattened buffer. This fix skips empty parameters in _get_gradients_for_reduction() since they contribute nothing to gradient reduction anyway.

Fixes test_onebit.py::TestOneBitLambEmptyParameters::test

Signed-off-by: Masahiro Tanaka <[email protected]>
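To illustrate the idea described above, here is a minimal sketch of filtering out empty parameters before a flatten / reduce / unflatten round trip. It is not DeepSpeed's code: `FakeParam`, `grads_for_reduction`, `flatten`, and `unflatten` are hypothetical names, plain Python lists stand in for torch tensors so the example runs without dependencies, and the "allreduce" is simulated by averaging two identical copies. It does not reproduce the shape-mismatch failure itself; it only shows the filtering step and that the remaining gradients survive the round trip.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Tuple


@dataclass
class FakeParam:
    """Hypothetical stand-in for a parameter: a shape plus a flat gradient."""
    name: str
    shape: Tuple[int, ...]
    grad: List[float] = field(default_factory=list)

    def numel(self) -> int:
        # Number of elements implied by the shape; 0 if any dimension is 0.
        return prod(self.shape)


def grads_for_reduction(params: List[FakeParam]) -> List[FakeParam]:
    # Skip parameters with numel == 0: they add nothing to the reduced buffer.
    return [p for p in params if p.numel() > 0]


def flatten(params: List[FakeParam]) -> List[float]:
    # Concatenate every gradient into one contiguous buffer.
    return [g for p in params for g in p.grad]


def unflatten(flat: List[float], params: List[FakeParam]) -> List[List[float]]:
    # Slice the buffer back into per-parameter chunks, using numel as the size.
    out, offset = [], 0
    for p in params:
        out.append(flat[offset:offset + p.numel()])
        offset += p.numel()
    return out


params = [
    FakeParam("weight", (2, 2), [1.0, 2.0, 3.0, 4.0]),
    FakeParam("empty", (0, 4), []),  # numel == 0
    FakeParam("bias", (2,), [5.0, 6.0]),
]

kept = grads_for_reduction(params)
flat = flatten(kept)
# Stand-in for allreduce averaging over 2 ranks holding identical gradients.
reduced = [(g + g) / 2 for g in flat]
restored = unflatten(reduced, kept)
```

In this sketch only `weight` and `bias` reach the flattened buffer, so the unflatten step never has to reconstruct a zero-element chunk.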

PKUWZP · 2026-01-18T05:50:24Z

deepspeed/runtime/engine.py


+            # Skip empty parameters (numel=0) as they contribute nothing to gradient reduction
+            # and cause issues with flatten/unflatten operations
+            if param.numel() == 0:


@tohtana Very clean fix! The only minor comment is that maybe we can add an explicit test for gradient reduction? It's optional though.

tohtana · 2026-01-18

Thank you for the feedback. Can you elaborate on what test you are suggesting? Running only gradient reduction?

tohtana requested a review from tjruwase as a code owner January 18, 2026 03:14

PKUWZP approved these changes Jan 18, 2026


tohtana mentioned this pull request Jan 18, 2026

Add full test suite workflow #7795

Open
