System Info
With transformers==4.57.3 and deepspeed==0.18.3,
as shown in the screenshot below, when accelerator.backward is called, the DeepSpeed backward internally calls engine.step, which performs an optimizer step at the gradient accumulation boundary.
The screenshot below is from trainer.py in the transformers library.
Additionally, inside the Trainer, optimizer.step is called again after this backward at the gradient accumulation step; attaching the screenshot below for reference.
So within a single iteration it performs two optimizer steps, which is wrong. Please fix this bug.
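To make the reported flow concrete, here is a minimal, hedged sketch of the loop in question. The toy model, optimizer, and `gradient_accumulation_steps` value are illustrative placeholders, and the loop paraphrases the Trainer logic rather than quoting trainer.py; a DeepSpeed plugin is assumed to be configured for the Accelerator.

```python
import torch
from accelerate import Accelerator

# Sketch only: assumes DeepSpeed is enabled via `accelerate config`;
# the model and optimizer below are toy placeholders.
accelerator = Accelerator()
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

gradient_accumulation_steps = 4

for step in range(8):
    inputs = torch.randn(4, 8, device=accelerator.device)
    loss = model(inputs).sum()

    # With DeepSpeed, accelerator.backward() delegates to
    # DeepSpeedEngine.backward(); per the report above, the engine also
    # calls engine.step() internally at the accumulation boundary (1st step).
    accelerator.backward(loss)

    if (step + 1) % gradient_accumulation_steps == 0:
        # The Trainer then calls optimizer.step() after backward,
        # which would be the 2nd optimizer step in the same iteration.
        optimizer.step()
        optimizer.zero_grad()
```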
Who can help?
No response
Information
Tasks
Reproduction
This bug is currently working as a feature
Expected behavior
There should be a single optimizer step per iteration, but there are currently two, which is wrong.