Fix zero/division safety gaps in utility and inference paths by harshang03 · Pull Request #7855 · deepspeedai/DeepSpeed

harshang03 · 2026-02-17T18:05:38Z

Describe your changes

- Added a shared non-zero divisor validator and wired it into group divisibility checks and inference ceil_div.
- Added strict steps_per_output validation in ThroughputTimer so invalid values fail early instead of triggering modulo-by-zero at runtime.
- Hardened HPU FP8 dequantization to reject zero or non-finite scales before inverse-scale computation.
- Added targeted regression tests for groups, timer, inference utils, and HPU quantizer scale validation.
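
For readers who want a concrete picture of the divisor and timer guards described above, here is a minimal sketch. It is not the code from this PR: the helper names `ensure_nonzero_divisor`, `ensure_divisibility`, and `validate_steps_per_output`, the choice of `ValueError`, and the exact messages are all assumptions made for illustration. Only `ceil_div` and `steps_per_output` are names taken from the description.

```python
from numbers import Integral


def ensure_nonzero_divisor(denominator, name="denominator"):
    """Hypothetical shared validator: fail early instead of hitting '/' or '%' with zero."""
    if denominator == 0:
        raise ValueError(f"{name} must be non-zero, got {denominator!r}")
    return denominator


def ensure_divisibility(numerator, denominator):
    """Check that numerator is an exact multiple of denominator."""
    ensure_nonzero_divisor(denominator)
    if numerator % denominator != 0:
        raise ValueError(f"{numerator} is not divisible by {denominator}")


def ceil_div(a, b):
    """Integer ceiling division that rejects a zero divisor up front."""
    ensure_nonzero_divisor(b, "b")
    return -(-a // b)


def validate_steps_per_output(steps_per_output):
    """Assumed rule: the reporting interval must be a positive integer.

    A zero interval would otherwise surface later as a modulo-by-zero
    in an expression such as `step % steps_per_output`.
    """
    if isinstance(steps_per_output, bool) or not isinstance(steps_per_output, Integral):
        raise TypeError(f"steps_per_output must be an integer, got {steps_per_output!r}")
    if steps_per_output <= 0:
        raise ValueError(f"steps_per_output must be positive, got {steps_per_output}")
    return steps_per_output
```

The point of validating at construction or call entry is that the error names the offending argument, rather than appearing as a bare `ZeroDivisionError` deep inside a training loop.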

Screenshot or video (only for visual changes)

N/A

GitHub Issue Link (if applicable)

[BUG] Multiple missing zero-guards cause ZeroDivisionError / non-finite values across DeepSpeed (4 locations) #7838

Testing Plan

Explanation of why no additional tests are needed:
- Added focused unit tests that directly cover each reported failure mode and guard path.
Unit Tests (JS and/or Python):
- ./.venv/bin/python -m pytest tests/unit/utils/test_groups.py tests/unit/utils/test_timer.py tests/unit/inference/test_inference_utils.py tests/unit/ops/fp_quantizer/test_fp_quantizer_scale_validation.py
E2E Tests:
- Not run (change is utility-level and covered by unit tests).
Any manual testing needed?:
- No.

Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

Add explicit validation for divisor inputs in groups and inference utilities, enforce valid throughput report intervals, and reject invalid HPU dequantization scales to avoid ZeroDivisionError and silent inf/nan propagation.
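
The scale check can be illustrated the same way. The sketch below uses plain Python floats and lists rather than tensors, and the names `inverse_scale` and `dequantize` are hypothetical, not the HPU quantizer's API. It also assumes that dequantization multiplies by the inverse scale, which the description implies but does not spell out. With tensor arithmetic, dividing by a zero scale typically yields `inf` rather than raising, which is how invalid values can propagate silently; an explicit check turns that into an immediate error.

```python
import math


def inverse_scale(scale):
    """Return 1/scale after rejecting zero, inf, and nan (hypothetical helper)."""
    scale = float(scale)
    if not math.isfinite(scale) or scale == 0.0:
        raise ValueError(f"dequantization scale must be finite and non-zero, got {scale!r}")
    return 1.0 / scale


def dequantize(values, scale):
    """Toy dequantization: multiply each value by the validated inverse scale."""
    inv = inverse_scale(scale)
    return [v * inv for v in values]
```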

Fix zero and non-finite guard handling in key math paths.

f8174af

harshang03 wants to merge 1 commit into deepspeedai:master from harshang03:fix/issue-7838-zero-guards
